
Quickly Deploying a ClickHouse Cluster with Docker (3 Shards, 3 Replicas)


Background

The previous post briefly covered installing ClickHouse and using its client. In real production environments, ClickHouse is usually deployed as a cluster. Because many systems do not support the SSE 4.2 instruction set, we use Docker here to build a ClickHouse cluster.

1. Environment

1.1 Machine list

All three machines are 8c8g and run CentOS 7.3:

server01 (192.168.21.21), running clickhouse-server instances cs01-01 and cs01-02:
- clickhouse01-01: instance 1, ports: tcp 9000, http 8123, replication 9009; role: shard 1, replica 1
- clickhouse01-02: instance 2, ports: tcp 9001, http 8124, replication 9010; role: shard 2, replica 2 (second replica of shard 2)

server02 (192.168.21.69), running clickhouse-server instances cs02-01 and cs02-02:
- clickhouse02-01: instance 1, ports: tcp 9000, http 8123, replication 9009; role: shard 2, replica 1
- clickhouse02-02: instance 2, ports: tcp 9001, http 8124, replication 9010; role: shard 3, replica 2 (second replica of shard 3)

server03 (192.168.21.6), running clickhouse-server instances cs03-01 and cs03-02:
- clickhouse03-01: instance 1, ports: tcp 9000, http 8123, replication 9009; role: shard 3, replica 1
- clickhouse03-02: instance 2, ports: tcp 9001, http 8124, replication 9010; role: shard 1, replica 2 (second replica of shard 1)

1.2 Machine initialization

1.2.1 Configure hosts

Run vi /etc/hosts and add the following three lines:

192.168.21.21 server01

192.168.21.69 server02

192.168.21.6 server03

1.2.2 Install Docker

Install the same Docker version on each machine.

See: Docker environment setup

1.2.3 Install ZooKeeper

See: ZooKeeper cluster setup (3 nodes)

1.3 Directory initialization

1.3.1 Create the local data paths

On each of the three servers, create the data directories (the two instances on each machine later mount /data/clickhouse01 and /data/clickhouse02 respectively):

mkdir -p /data/clickhouse /data/clickhouse01 /data/clickhouse02

1.3.2 Obtain the clickhouse-server configuration

1) On server01, do the following.

Start clickhouse-server first with the docker command from the official guide:

docker run -d --name clickhouse-server --ulimit nofile=262144:262144 --volume=/data/clickhouse/:/var/lib/clickhouse yandex/clickhouse-server

2) Once it is up, copy the configuration files out of the container onto the host:

#Copy the configuration out of the container into /etc
docker cp clickhouse-server:/etc/clickhouse-server/ /etc/
#On server01, make one copy of the directory per instance
cp -rf /etc/clickhouse-server/ /etc/clickhouse-server01/
cp -rf /etc/clickhouse-server/ /etc/clickhouse-server02/
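The fan-out to the other machines in the next step can also be scripted. A minimal sketch, using the hostnames configured above; DRYRUN defaults to 1 here so the loop only prints the commands and can be tried without SSH access (set DRYRUN=0 to actually copy):

```shell
#!/bin/sh
# Fan the configuration directory out to server02 and server03,
# one copy per instance. With DRYRUN=1 (the default here) the
# commands are only printed; set DRYRUN=0 to actually run scp.
cmds=""
for host in server02 server03; do
    for inst in 01 02; do
        cmd="scp -r /etc/clickhouse-server/ $host:/etc/clickhouse-server$inst/"
        cmds="$cmds$cmd\n"
        if [ "${DRYRUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
done
```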

3) Then copy /etc/clickhouse-server/ to each of the other machines (-r is required because these are directories):

#Copy the configuration to server02
scp -r /etc/clickhouse-server/ server02:/etc/clickhouse-server01/
scp -r /etc/clickhouse-server/ server02:/etc/clickhouse-server02/
#Copy the configuration to server03
scp -r /etc/clickhouse-server/ server03:/etc/clickhouse-server01/
scp -r /etc/clickhouse-server/ server03:/etc/clickhouse-server02/

2. Building the cluster

2.1 Cluster topology

Cluster layout:

clickhouse01-01: instance 1, ports: tcp 9000, http 8123, replication 9009; role: shard 1, replica 1

clickhouse01-02: instance 2, ports: tcp 9001, http 8124, replication 9010; role: shard 2, replica 2 (second replica of shard 2)

clickhouse02-01: instance 1, ports: tcp 9000, http 8123, replication 9009; role: shard 2, replica 1

clickhouse02-02: instance 2, ports: tcp 9001, http 8124, replication 9010; role: shard 3, replica 2 (second replica of shard 3)

clickhouse03-01: instance 1, ports: tcp 9000, http 8123, replication 9009; role: shard 3, replica 1

clickhouse03-02: instance 2, ports: tcp 9001, http 8124, replication 9010; role: shard 1, replica 2 (second replica of shard 1)

2.2 Configuring the cluster

2.2.1 Files to modify

Two configuration files need changes (users.xml can also be adjusted if needed):

/etc/clickhouse-server/config.xml
/etc/clickhouse-server/metrika.xml (new file)

2.2.2 Configure the clickhouse-server instances on server01

2.2.2.1 Configuration for clickhouse-01-01:

1) /etc/clickhouse-server01/config.xml (the other instances can use exactly the same content)

Point the include_from node at the file actually being included, and set the listen addresses:

<include_from>/etc/clickhouse-server/metrika.xml</include_from>
<listen_host>0.0.0.0</listen_host>
<listen_host>127.0.0.1</listen_host>

2) /etc/clickhouse-server01/metrika.xml (every instance uses the same content except for the <macros> section)

<yandex>
    <clickhouse_remote_servers>
        <cluster_3s_1r>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>server01</host>
                    <port>9000</port>
                    <user>default</user>
                </replica>
                <replica>
                    <host>server03</host>
                    <port>9001</port>
                    <user>default</user>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>server02</host>
                    <port>9000</port>
                    <user>default</user>
                </replica>
                <replica>
                    <host>server01</host>
                    <port>9001</port>
                    <user>default</user>
                </replica>
            </shard>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>server03</host>
                    <port>9000</port>
                    <user>default</user>
                </replica>
                <replica>
                    <host>server02</host>
                    <port>9001</port>
                    <user>default</user>
                </replica>
            </shard>
        </cluster_3s_1r>
    </clickhouse_remote_servers>
    <zookeeper-servers>
        <node index="1">
            <host>192.168.21.66</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>192.168.21.57</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>192.168.21.17</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <replica>cluster01-01-1</replica>
    </macros>
    <networks>
        <ip>::/0</ip>
    </networks>
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>

2.2.2.2 Configuration for clickhouse-01-02:

1) /etc/clickhouse-server02/metrika.xml (identical to the file above except for the <macros> section):

<macros>
    <layer>01</layer>
    <shard>02</shard>
    <replica>cluster01-02-2</replica>
</macros>

2.2.3 Configure the clickhouse-server instances on server02

2.2.3.1 Configuration for clickhouse-02-01:

/etc/clickhouse-server01/metrika.xml (identical except for <macros>):

<macros>
    <layer>01</layer>
    <shard>02</shard>
    <replica>cluster01-02-1</replica>
</macros>

2.2.3.2 Configuration for clickhouse-02-02:

/etc/clickhouse-server02/metrika.xml (identical except for <macros>):

<macros>
    <layer>01</layer>
    <shard>03</shard>
    <replica>cluster01-03-2</replica>
</macros>

2.2.4 Configure the clickhouse-server instances on server03

2.2.4.1 Configuration for clickhouse-03-01:

1) /etc/clickhouse-server01/metrika.xml (identical except for <macros>):

<macros>
    <layer>01</layer>
    <shard>03</shard>
    <replica>cluster01-03-1</replica>
</macros>
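Only the <macros> fragment varies across the six instances. As a convenience, a small script sketch can generate all six fragments; the shard/replica mapping follows the topology above, and the output directory ./macros is an assumption for trying it out, not a path ClickHouse reads:

```shell
#!/bin/sh
# Generate the per-instance <macros> fragment for all six instances.
# Mapping: instance name -> shard number, replica name (see topology above).
OUT="${OUT:-./macros}"
mkdir -p "$OUT"
for spec in \
    "cs01-01 01 cluster01-01-1" \
    "cs01-02 02 cluster01-02-2" \
    "cs02-01 02 cluster01-02-1" \
    "cs02-02 03 cluster01-03-2" \
    "cs03-01 03 cluster01-03-1" \
    "cs03-02 01 cluster01-01-2"
do
    set -- $spec                      # $1=instance $2=shard $3=replica
    cat > "$OUT/$1-macros.xml" <<EOF
<macros>
    <layer>01</layer>
    <shard>$2</shard>
    <replica>$3</replica>
</macros>
EOF
done
```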

2.2.4.2 Configuration for clickhouse-03-02:

1. /etc/clickhouse-server02/metrika.xml (identical except for <macros>):

<macros>
    <layer>01</layer>
    <shard>01</shard>
    <replica>cluster01-01-2</replica>
</macros>

2.3 Running the ClickHouse cluster

2.3.1 Run the clickhouse01-01 instance

Log in to server01: ssh server01

#Remove any existing cs01-01 container first
docker rm -f cs01-01
#Run clickhouse-server
docker run -d \
  --name cs01-01 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse01/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server01/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --hostname $(hostname) \
  -p 9000:9000 \
  -p 8123:8123 \
  -p 9009:9009 \
  yandex/clickhouse-server

Notes:

--add-host: the configuration files refer to the servers by hostname; for the container to resolve those names, this flag is needed. Each mapping is added to the container's /etc/hosts automatically.

--hostname: the system.clusters table in ClickHouse shows cluster information, including an is_local column. Without a hostname set, ClickHouse cannot tell which entry is the local server; if is_local is 0 for every entry, cluster operations are affected. To check, log in to any clickhouse-server with clickhouse-client and run: SELECT * FROM system.clusters;

-p: publishes the container's ports on the host.
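All six instances below follow the same port scheme: instance 01 on each server publishes 9000/8123/9009, while instance 02 publishes 9001/8124/9010. A small sketch that lists each instance's host-side ports; the commented-out curl line shows how a running instance's HTTP endpoint could be probed, assuming the servers are up and reachable:

```shell
#!/bin/sh
# List the host-side ports of every instance in the cluster.
# Instance 01 -> tcp 9000, http 8123, replication 9009
# Instance 02 -> tcp 9001, http 8124, replication 9010
portmap=""
for server in server01 server02 server03; do
    for inst in 01 02; do
        if [ "$inst" = "01" ]; then
            tcp=9000 http=8123 repl=9009
        else
            tcp=9001 http=8124 repl=9010
        fi
        line="$server inst$inst tcp=$tcp http=$http repl=$repl"
        portmap="$portmap$line\n"
        echo "$line"
        # With the cluster running, each HTTP port answers on /ping:
        # curl -s "http://$server:$http/ping"
    done
done
```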

2.3.2 Run the clickhouse01-02 instance

Log in to server01: ssh server01

docker run -d \
  --name cs01-02 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse02/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server02/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --hostname $(hostname) \
  -p 9001:9000 \
  -p 8124:8123 \
  -p 9010:9009 \
  yandex/clickhouse-server

2.3.3 Run the clickhouse02-01 instance

Log in to server02: ssh server02

docker run -d \
  --name cs02-01 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse01/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server01/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --hostname $(hostname) \
  -p 9000:9000 \
  -p 8123:8123 \
  -p 9009:9009 \
  yandex/clickhouse-server

2.3.4 Run the clickhouse02-02 instance

Log in to server02: ssh server02

docker run -d \
  --name cs02-02 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse02/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server02/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --hostname $(hostname) \
  -p 9001:9000 \
  -p 8124:8123 \
  -p 9010:9009 \
  yandex/clickhouse-server

2.3.5 Run the clickhouse03-01 instance

Log in to server03: ssh server03

docker run -d \
  --name cs03-01 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse01/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server01/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --hostname $(hostname) \
  -p 9000:9000 \
  -p 8123:8123 \
  -p 9009:9009 \
  yandex/clickhouse-server

2.3.6 Run the clickhouse03-02 instance

Log in to server03: ssh server03

docker run -d \
  --name cs03-02 \
  --ulimit nofile=262144:262144 \
  --volume=/data/clickhouse02/:/var/lib/clickhouse \
  --volume=/etc/clickhouse-server02/:/etc/clickhouse-server/ \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  --hostname $(hostname) \
  -p 9001:9000 \
  -p 8124:8123 \
  -p 9010:9009 \
  yandex/clickhouse-server

2.4 Working with data on the ClickHouse cluster

2.4.1 Connect a client to a clickhouse-server

1) Run the client from any machine (to connect to a different ClickHouse instance, just change the --host and --port values):

docker run -it \
  --rm \
  --add-host server01:192.168.21.21 \
  --add-host server02:192.168.21.69 \
  --add-host server03:192.168.21.6 \
  yandex/clickhouse-client \
  --host server01 \
  --port 9000

2) Then inspect the cluster configuration:

#Check whether the shard and replica layout matches metrika.xml; if not, recheck the configuration of every clickhouse-server instance

#is_local should be 1 for the instance you are connected to, and the shard and replica information should all be correct

SELECT * FROM system.clusters;

2.4.2 Create the database

After configuring all instances as above, start them all, then run the following statement on every instance to create the database.

For example, on instance 01-02:

#Run the ClickHouse client to enter ClickHouse
docker run -it --rm --add-host server01:192.168.21.21 --add-host server02:192.168.21.69 --add-host server03:192.168.21.6 \
  yandex/clickhouse-client --host server01 --port 9001
ClickHouse client version 19.17.5.18 (official build).
Connecting to server01:9001 as user default.
Connected to ClickHouse server version 19.17.5 revision 54428.

i-r9es2e0q :) CREATE DATABASE test;

CREATE DATABASE test

Ok.

2.4.3 Create the replicated tables

Now create the corresponding replicated table on every instance; a simple table is used for this test. Connect to clickhouse01-01 first:

#Connect to clickhouse01-01 on port 9000:
docker run -it --rm --add-host server01:192.168.21.21 --add-host server02:192.168.21.69 --add-host server03:192.168.21.6 \
  yandex/clickhouse-client --host server01 --port 9000

Then run:

CREATE TABLE test.device_thing_data
(
    time UInt64,
    user_id String,
    device_id String,
    source_id String,
    thing_id String,
    identifier String,
    value_int32 Int32,
    value_float Float32,
    value_double Float64,
    value_string String,
    value_enum Enum8('0'=0,'1'=1,'2'=2,'3'=3,'4'=4,'5'=5,'6'=6,'7'=7,'8'=8),
    value_string_ex String,
    value_array_string Array(String),
    value_array_int32 Array(Int32),
    value_array_float Array(Float32),
    value_array_double Array(Float64),
    action_date Date,
    action_time DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/01-01/device_thing_data', 'cluster01-01-1')
PARTITION BY toYYYYMM(action_date)
ORDER BY (user_id, device_id, thing_id, identifier, time, intHash64(time))
SAMPLE BY intHash64(time)
SETTINGS index_granularity = 8192

Run the same statement on each of the other five instances, changing only the two ReplicatedMergeTree arguments (ZooKeeper path and replica name):

clickhouse01-02 (server01:9001): '/clickhouse/tables/01-02/device_thing_data', 'cluster01-02-2'
clickhouse02-01 (server02:9000): '/clickhouse/tables/01-02/device_thing_data', 'cluster01-02-1'
clickhouse02-02 (server02:9001): '/clickhouse/tables/01-03/device_thing_data', 'cluster01-03-2'
clickhouse03-01 (server03:9000): '/clickhouse/tables/01-03/device_thing_data', 'cluster01-03-1'
clickhouse03-02 (server03:9001): '/clickhouse/tables/01-01/device_thing_data', 'cluster01-01-2'

(Since each instance defines <shard> and <replica> macros in its metrika.xml, the two arguments can alternatively be written once, on every instance, as '/clickhouse/tables/{layer}-{shard}/device_thing_data' and '{replica}'.)

2.4.4 Create the distributed table (for queries)

With the replicated tables in place, the distributed table can be created. A distributed table is only a query engine and stores no data itself: at query time it forwards the SQL to every shard in the cluster, then processes and aggregates the results before returning them to the client. ClickHouse therefore requires the aggregated result to fit in the memory of the node hosting the distributed table, which is rarely a limit under normal conditions. The distributed table can be created on all instances or only on some of them, matching whichever instances the application code queries; creating it on several is recommended, so that when one node goes down the others can still be queried. The DDL is as follows:

#Creating all the distributed tables across the cluster in one statement hung; cause unknown
CREATE TABLE device_thing_data_all ON CLUSTER cluster_3s_1r AS test.device_thing_data
ENGINE = Distributed(cluster_3s_1r, test, device_thing_data, rand())

#This one instead has to be run once on every machine
CREATE TABLE device_thing_data_all AS test.device_thing_data
ENGINE = Distributed(cluster_3s_1r, test, device_thing_data, rand())

2.4.5 Testing availability

#Connect a client to some clickhouse-server instance (e.g. cs01-01)
#Query the distributed table: no data yet
select * from device_thing_data_all;
#On cs01-01, check the local replicated table: no data yet either
select * from test.device_thing_data;
#Insert one row into the replicated table
INSERT INTO test.device_thing_data (user_id) VALUES ('1');
#Since a row was just inserted on cs01-01, there should now be data
select * from test.device_thing_data;
#The distributed table now returns data as well
select * from device_thing_data_all;
#On cs03-02, query the replicated table: because cs03-02 is the replica of cs01-01, the row has been synced over automatically
select * from test.device_thing_data;
#Insert a row on cs03-01; it should be synced to cs02-02
INSERT INTO test.device_thing_data (user_id) VALUES ('2');
#Query the distributed table again; it should now return both rows (from cs03-01 and cs01-01)
select * from device_thing_data_all;
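The rand() sharding key used by the Distributed engine above spreads inserted rows roughly evenly across the three shards. A toy sketch of that idea in plain shell/awk (not ClickHouse code, just an illustration):

```shell
#!/bin/sh
# Simulate rand()-based sharding: 9000 rows are each assigned to one of
# three shards uniformly at random, as the Distributed engine does.
counts=$(awk 'BEGIN {
    srand(42)
    for (i = 0; i < 9000; i++) {
        shard = int(rand() * 3) + 1   # pick shard 1..3 uniformly
        c[shard]++
    }
    print c[1], c[2], c[3]
}')
echo "rows per shard: $counts"
```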


Author: 测试生财 (a test-dev engineer who refuses to do 996 for 996's sake)

Motto: focus on test development and automated operations, read, think, and write diligently, and lay the groundwork for financial freedom.

Topics: technical growth, workplace talk, career development, reading and writing, investing, healthy living.

CSDN: https://blog.csdn.net/ccgshigao

cnblogs: https://www.cnblogs.com/qa-freeroad/

51cto: https://blog.51cto.com/14900374

WeChat public account: 测试生财 (exclusive content and resources shared regularly)


